Section 1: Data Error Checks

1. Project Coverage

In [19]:
print("Districts: ", raw_data["District"].value_counts().shape[0], 
      "  |  Blocks: ", raw_data["Block"].value_counts().shape[0],
      "  |  Gram Panchayats: ", raw_data[raw_data["GP ID"]==False].shape[0],
      "  |  Villages: ", raw_data[raw_data["Village ID"]==False].shape[0],
      "  |  Surveys: ", raw_data["Village"].count())
Districts:  10   |  Blocks:  63   |  Gram Panchayats:  240   |  Villages:  631   |  Surveys:  43183
In [21]:
display_barh(raw_data['District'].value_counts(),title="Number of Surveys in Each District", size=[6,6])
In [22]:
hh_no_cnsnt = raw_data[raw_data["I have consent from family head/ adult member to proceed with the survey."] == "No"]
hh_no_cnsnt.shape[0]
Out[22]:
1056

2. Total Duplicate records:

Based on the unique combination - District + Block + Gram Panchayat + Village + Household Number

In [23]:
### Data Error Condition 1: Number of Duplicate Records identified using Duplicate Key above
dup_records = raw_data[raw_data["is_duplicate_record"] == True]
dup_records.shape[0]
Out[23]:
581
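
The notebook does not show how `is_duplicate_record` was derived. A minimal sketch on toy data, assuming the flag marks every record that is followed by a later record with the same key (which would be consistent with the `keep='last'` clean-up step further down):

```python
import pandas as pd

# Toy data; the column names follow the duplicate key described above.
KEY = ["District", "Block", "Gram Panchayat", "Village", "Household number"]
toy = pd.DataFrame({
    "District": ["GANJAM", "GANJAM", "GANJAM", "GAJAPATI"],
    "Block": ["B1", "B1", "B1", "B2"],
    "Gram Panchayat": ["GP1", "GP1", "GP1", "GP2"],
    "Village": ["V1", "V1", "V1", "V2"],
    "Household number": [1, 1, 2, 1],
})

# Assumption: keep="last" flags all but the final submission for each key.
toy["is_duplicate_record"] = toy.duplicated(subset=KEY, keep="last")
n_duplicates = int(toy["is_duplicate_record"].sum())
print(n_duplicates)
```

With this convention the number of flagged rows equals the number of rows that `drop_duplicates(..., keep='last')` removes, which matches the figures reported here (43183 - 581 = 42602).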

Who Submitted the Duplicate Records-

In [24]:
dup_records['Volunteer Name'].value_counts()
Out[24]:
Sarita Majhi            52
Manasi Raita            44
Dillip Kumar Sabar      40
Sudam Sabar             36
Manaya Raita            29
Urbashi Rout            25
Asish Kumar Sabar       20
Gayatri Sabar           20
Puspanjali Bishoi       18
Chandra Singh Majhi     18
Laxmi Karjee            17
Rajani Karjee           16
Puspanjali Bhuyan       15
Gouri Karjee            15
Eliya Sabar             13
Premalata Gamango       13
Sujata Bhuyan           13
Dillip Raika            12
Sumanta Raita           11
Pabitra Naik            11
Dambrudhar Dalai        10
Dilip Kumar Nayak        9
Baman Sing Majhi         9
Brundabati Sabar         8
DukhiRam Naik            8
Prabhat Raita            7
Abnaijar Raika           7
Sunthani Gamango         7
Naman Bhuyan             7
Suniel Sabar             6
Prasant Kumar Nayak      6
Subhasmita Biswal        6
Rajeshwari Badaraita     6
Ajaya Patra              6
Jeebita Bhuyan           6
Kulamani Majhi           5
Bibhuti Patra            4
Astina Raika             4
Gouri Shankar Sabar      3
Bairagi Karjee           3
Jihosay Mandal           2
Debashis Nayak           2
Sonu Nayak               2
Bidyutprava Praharaj     2
Rabi Badaraita           1
Padmanabham Sabar        1
Ranjit Karjee            1
Priya Ranjan Pradhan     1
Ebel Raita               1
Mahendra Kumar Naik      1
Aswini Kumar Naik        1
Niraj Nial               1
Name: Volunteer Name, dtype: int64

Removing the duplicate records from analysis...

Number of unique records:

In [25]:
raw_data.drop_duplicates(subset=["District",
                                 "Block",
                                 "Gram Panchayat",
                                 "Village",
                                 "Household number"],
                         keep='last',
                         inplace=True)
raw_data.shape[0]
Out[25]:
42602

Number of records without consent

In [26]:
no_cnsnt_col="I have consent from family head/ adult member to proceed with the survey."
hh_no_cnsnt = raw_data[raw_data[no_cnsnt_col] == "No"]
hh_no_cnsnt.shape[0]
Out[26]:
1033
In [27]:
display_bar(hh_no_cnsnt["District"].value_counts(),title="Number of Households not Surveyed", size=[14,3])

Why consent could not be obtained for the survey:

In [28]:
raw_data["Why did you not get permission to do the survey?"].value_counts()
Out[28]:
House locked - family away for few days          575
House locked - family does not live there now    399
Respondent declined                               37
No adult at home                                  22
Name: Why did you not get permission to do the survey?, dtype: int64
In [29]:
display_donut(raw_data["Why did you not get permission to do the survey?"],
              title='Why did you not get permission to do the survey?',
             width=5,
             height=5,
             pct=True)

List of Villages with number of households where survey could not be conducted-

In [32]:
create_download_link(hh_no_cnsnt_pvt,
                     title='Click to Download',
                     filename='List of Households - Survey not conducted.csv',
                    level='Village')
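
The helper `create_download_link` is not defined in the cells shown. One common way to build such a link in a notebook is to embed the table as a base64-encoded CSV in an HTML anchor. The sketch below is an assumption, not the notebook's implementation: `make_csv_link` is a hypothetical name, and the `level` argument is omitted because its meaning is not shown.

```python
import base64
import pandas as pd

def make_csv_link(df, title="Click to Download", filename="data.csv"):
    """Return an HTML anchor embedding df as a base64-encoded CSV (hypothetical helper)."""
    payload = base64.b64encode(df.to_csv().encode("utf-8")).decode("ascii")
    return (f'<a download="{filename}" '
            f'href="data:text/csv;base64,{payload}" target="_blank">{title}</a>')

# Toy table indexed by village.
toy = pd.DataFrame({"Households": [3, 5]},
                   index=pd.Index(["V1", "V2"], name="Village"))
link_html = make_csv_link(toy, filename="toy.csv")
```

In a notebook cell the string would typically be wrapped in `IPython.display.HTML(link_html)` so that it renders as a clickable link.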

Removing the records without consent from analysis

Number of records with consent-

In [33]:
#raw_data=raw_data.drop(raw_data[raw_data["I have consent from family head/ adult member to proceed with the survey."] == "No"].index)
value.append('Total Households')
print(value)
orig_hh_cnt=raw_data.pivot_table(index=['District','Block','Gram Panchayat','Village'],
                                values=value,
                                aggfunc=np.sum) #Create Copy of Original Household Count
raw_data=raw_data[raw_data[no_cnsnt_col]=="Yes"] #Keep records only with consent
raw_data.shape[0]
['Why did you not get permission to do the survey?-House locked - family away for few days', 'Why did you not get permission to do the survey?-House locked - family does not live there now', 'Why did you not get permission to do the survey?-Respondent declined', 'Why did you not get permission to do the survey?-No adult at home', 'Total Households']
Out[33]:
41569

3. Missing Data in Mandatory Columns:

Observations:
-----
Missing Data in  Caste (Avoiding asking. Ask only if doubtful) :  251 ( 0.6 %)
.....
ST         22167
OBC        12297
SC          3577
General     3277
Name: Caste (Avoiding asking. Ask only if doubtful), dtype: int64
-----

Section 3: Data Analysis

1. Overview

Total Households included in Survey:

In [35]:
total_hh = raw_data.shape[0]
print(total_hh)
41569

Households by Social Category:

Out[36]:
         Value      %
ST       22167  53.65
OBC      12297  29.76
SC        3577   8.66
General   3277   7.93
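
The percentages above appear to be computed over households with a recorded category (41318) rather than over all 41569 surveyed households; the difference is the 251 records with missing caste. A quick check using only the counts reported in this notebook:

```python
# Counts copied from the caste breakdown above.
counts = {"ST": 22167, "OBC": 12297, "SC": 3577, "General": 3277}

answered = sum(counts.values())  # households with a recorded category
shares = {k: round(100 * v / answered, 2) for k, v in counts.items()}
print(answered, shares)
```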

Distribution of households across socio-economic categories, by district

In [38]:
raw_data.pivot_table(index=["District"], 
                        columns=["Caste (Avoiding asking. Ask only if doubtful)"],
                        values=['Total Households'],
                        aggfunc={'Total Households':[np.sum]}).style.apply(highlight_max,axis=1)
Out[38]:
Total Households (sum), by Caste (Avoiding asking. Ask only if doubtful)
District     OBC  General    ST    SC
GAJAPATI     434        7  8209    60
GANJAM      6370     2703  2518  1573
JHARSUGUDA   676       85  1072   180
KALAHANDI    500       26  3704  1026
KANDHAMAL      5        4   507    96
KEONJHAR    2531      239  1811   345
MAYURBHANJ   891      202  1007   120
NAYAGARH     804        6   513    34
RAYAGADA       1        1  1423    30
SUNDERGARH    85        4  1403   113
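
`highlight_max` is passed to `Styler.apply` above but is not defined in the cells shown. A minimal sketch of what such a function might look like (the name is taken from the call, the body and the colour are assumptions): it receives one row as a Series and returns one CSS string per cell.

```python
import pandas as pd

def highlight_max(row, color="yellow"):
    """Return one CSS string per cell, colouring the row maximum (hypothetical body)."""
    is_max = row == row.max()
    return [f"background-color: {color}" if flag else "" for flag in is_max]

# Counts for one district, copied from the table above.
ganjam = pd.Series({"OBC": 6370, "General": 2703, "ST": 2518, "SC": 1573})
styles = highlight_max(ganjam)
print(styles)
```

With `axis=1`, `df.style.apply(highlight_max, axis=1)` calls the function once per row, so the largest category in each district is highlighted.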

2. Availability of TBR

Out[39]:
Availability of TBR  Total Households  % of Total
Yes, I have TBR                 31096       74.81
Toilet only                      1609        3.87
No, never had TBR                7230       17.39
Had TBR but not now              1634        3.93

Distribution of Villages by % of Households having TBR

No of Villages with 100% households having TBR or Toilet:  23  ( 3.65 %)
No of Villages with No(0%) households having TBR or Toilet:  6  ( 0.95 %)
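
The cells that produce these village counts are not shown. A minimal sketch on toy data, assuming each household carries a 0/1 indicator (`has_facility` is a hypothetical column name) and that the share is computed per village:

```python
import pandas as pd

# Toy household records; has_facility is a hypothetical 0/1 indicator.
toy = pd.DataFrame({
    "Village": ["V1", "V1", "V2", "V2", "V3", "V3"],
    "has_facility": [1, 1, 0, 0, 1, 0],
})

# Percentage of households with the facility in each village.
share = toy.groupby("Village")["has_facility"].mean() * 100
n_all = int((share == 100).sum())   # villages where every household has it
n_none = int((share == 0).sum())    # villages where no household has it
pct_all = round(100 * n_all / share.size, 2)
print(n_all, n_none, pct_all)
```

The reported percentages are consistent with a denominator of 631 villages (23 / 631 is about 3.65% and 6 / 631 is about 0.95%).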

Why was there never a toilet in the household?

In [48]:
print("Households which never had a toilet:",hh_by_tbr.loc['No, never had TBR','Total Households'] )
display_donut(raw_data['Why did you never have a Toilet?'],
             title='',
             width=6,
             height=6,
             pct=True)
Households which never had a toilet: 7230
In [49]:
raw_data['Why did you never have a Toilet?'].value_counts()
Out[49]:
New house due to separation from family - Committee did not build    4259
Came to village after GV intervention - Committee did not build       858
Not built during GV intervention                                      721
Money problem                                                         698
No space for a Toilet                                                 161
Other (please specify)                                                148
No consensus in family                                                 50
I dont want a Toilet                                                   11
Name: Why did you never have a Toilet?, dtype: int64

Where a TBR was available earlier but is not available now, what happened to it?

In [50]:
display_barh(raw_data['What happened to your TBR?'].value_counts(),
              title='Current Status of TBR',
           size=[7,4])

So what is the common practice for defecation in households?

In [51]:
display_donut(raw_data['You do not have a Toilet. Where do you defecate?'],
              title='', 
              width=4, 
              height=4,
             pct=True)

Which are the villages where open defecation is widespread (Top 20) ?

In [52]:
col='You do not have a Toilet. Where do you defecate?'
In [54]:
od_hh_dist=raw_data.pivot_table(index=['District','Block','Gram Panchayat','Village'],
                    values=value,
                    aggfunc=np.sum)
od_hh_dist.fillna(0,inplace=True)
In [55]:
od_hh_dist.sort_values(value[0], ascending=False).head(20)
Out[55]:
Total Households You do not have a Toilet. Where do you defecate?-Open Defecation You do not have a Toilet. Where do you defecate?-Other (please specify) You do not have a Toilet. Where do you defecate?-Other's Toilet
District Block Gram Panchayat Village
GAJAPATI GOSANI LABANYAGADA S. RAUTAPUR 34 8 0 0
KALAHANDI THUAMUL RAMPUR KERPAI TADADEI 33 15 0 0
CHIMRANGPADAR 17 9 0 0
KACHALEKHA 80 53 0 3
KERPAI 56 16 0 3
MAHAJAL 21 21 0 0
MAJHIGAON 90 77 0 0
RANIBALI 12 12 0 0
SERKAPAI 45 42 0 0
MALLIGAON BHITARPADAR 21 4 0 0
KARLAPAT SANDHIMUNDA 12 4 0 0
MALLIGAON DIGIRIBANDHA 48 8 0 0
MALIGAON 56 8 0 0
NAKRUNDI CHULBADI 78 47 0 0
KARNIBEL 21 7 0 0
KUTRUGUDA 12 5 0 0
SIGNI 33 14 0 0
TALAMPADAR 114 35 0 1
KARLAPAT VEZIGUDA 117 29 0 1
KUANG 41 7 0 0

All villages where open defecation (OD) is practised-

In [56]:
create_download_link(od_hh_dist,filename="Distribution of OD Households.csv", level='Village')

3. Water Supply to Households

Households Connected to Water Supply

Out[57]:
Household Connected with Water Supply System  Total Households  % of Total
Yes, Having Water Connection                             31019       74.62
Never had water connection                                9977       24.00
Had Water Connection in past but not now                   573        1.38

Observations: 25.38% of households do not have a water connection. Of those, 94.57% (i.e. 24% of all households) never had a connection.
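
The figures in this observation can be reproduced from the counts in the table above. A short worked check (plain arithmetic, no assumptions beyond the reported counts):

```python
total = 41569      # households with consent
never = 9977       # never had a water connection
past_only = 573    # had a connection in the past but not now

not_connected = never + past_only
pct_not_connected = round(100 * not_connected / total, 2)
pct_never_of_not_connected = round(100 * never / not_connected, 2)
pct_never_of_total = round(100 * never / total, 2)
print(pct_not_connected, pct_never_of_not_connected, pct_never_of_total)
```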

Distribution of Villages by % of Households Not having Water Supply Connection

No. of Villages with 100% households having Water Supply Connection:  24  ( 3.8 %)
No. of Villages with No (0%) households having Water Supply Connection:  8  ( 1.27 %)

Why did households never have tap connection?

Distribution of households (by %) by major reasons for not getting water supply connection:

In [65]:
print("Total Households which never connected to WSS: ",hh_by_ws.iloc[1,0],"(",hh_by_ws.iloc[1,1],"%)")
Total Households which never connected to WSS:  9977 ( 24.0 %)

a. Across all regions -

In [68]:
value = [col+"-"+x for x in options]
hh_nvr_cnctd_sw = raw_data[raw_data["Did you ever have tap connection (individual pipeline) to your house/ TBR?"]=="No"].pivot_table(index='State',
               values = value,
               aggfunc=np.mean).round(2)
display_bar(hh_nvr_cnctd_sw[value],size=[15,9],title='')

b. In districts

In [69]:
hh_nvr_cnctd_dw = raw_data[raw_data["Did you ever have tap connection (individual pipeline) to your house/ TBR?"]=="No"].pivot_table(index='District',
               values = value,
               aggfunc=np.mean).round(2)

display_bar(hh_nvr_cnctd_dw[value],size=[15,10],title='')

c. In Villages

In [70]:
### Create a Pivot % of such households not receiving water village wise along with reason 
hh_nvr_cnctd_vw = raw_data.pivot_table(index = ['District','Block','Gram Panchayat', 'Village'],
                    values = value,
                    aggfunc = np.mean
                    ).round(2)

create_download_link(hh_nvr_cnctd_vw,title="Download Complete Table",filename="Reasons for No WSS Connection.csv", level='Village')

What is the willingness of households to reconnect the water supply?

Among those households which had a tap connection earlier but not at present.

In [71]:
print("Total Households which were once connected to WSS but not now: ",hh_by_ws.iloc[2,0],"(",hh_by_ws.iloc[2,1],"%)")
Total Households which were once connected to WSS but not now:  573 ( 1.38 %)

Among those households which never had a tap connection

In [74]:
print("Total Households which never connected to WSS: ",hh_by_ws.iloc[1,0],"(",hh_by_ws.iloc[1,1],"%)")
Total Households which never connected to WSS:  9977 ( 24.0 %)

Condition of tap connection (pipeline to house/ TBR)

Observations: 9.26% of the household tap connections are not working

Supply of water to households through pipeline

Total Households with Water Supply Connection:  31019
Out[78]:
     Water supplied to Households  % of Households (with Water Supply Connection)  % of Total Households
Yes                         26886                                           86.68                  64.68
No                           4133                                           13.32                   9.94

How often do households get water? (Number of households that get water regularly)

Overall Status of Water Supply System in the Village

Out[80]:
   Status of Water Supply                    Total Households  % of Total
0  365 days a year but not 24/7                         12467       29.99
1  365 days a year and 24/7                              7506       18.06
2  Not 12 months but 24/7                                5383       12.95
3  Not 12 months and not 24/7                            1530        3.68
4  Never get water supply                                4133        9.94
5  Never had water connection                            9977       24.00
6  Had Water Connection in past but not now               573        1.38

Distribution of villages by frequency of water supply to households (%)


How often does the given % of connected households get water supply in the village?

Out[84]:
                                                                 (-0.001, 5.0]    (5.0, 10.0]   (10.0, 20.0]   (20.0, 30.0]   (30.0, 40.0]   (40.0, 50.0]   (50.0, 60.0]   (60.0, 70.0]   (70.0, 80.0]   (80.0, 90.0]  (90.0, 100.0]
How often do you get water?-Not 12 months and not 24/7-Binned              508              1              1              3              2              3              2              2              2              0             20
How often do you get water?-365 days a year and 24/7-Binned                350              6              2              0              4              1              0              3              1              2            175
How often do you get water?-365 days a year but not 24/7-Binned            349              1              1              2              4              2              1              4              1              5            174
How often do you get water?-Not 12 months but 24/7-Binned                  383              2              4              1              2              2              2              2              3              2            141
Number of Villages having 100% of Households receiving Water-Not 12 months and not 24/7:  15  ( 2.38 %)
Number of Villages having No (0%) Household receiving Water-Not 12 months and not 24/7:  505  ( 80.03 %)
Number of Villages having 100% of Households receiving Water-365 days a year and 24/7:  174  ( 27.58 %)
Number of Villages having No (0%) Household receiving Water-365 days a year and 24/7:  336  ( 53.25 %)
Number of Villages having 100% of Households receiving Water-365 days a year but not 24/7:  155  ( 24.56 %)
Number of Villages having No (0%) Household receiving Water-365 days a year but not 24/7:  346  ( 54.83 %)
Number of Villages having 100% of Households receiving Water-Not 12 months but 24/7:  140  ( 22.19 %)
Number of Villages having No (0%) Household receiving Water-Not 12 months but 24/7:  376  ( 59.59 %)
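
The binning code behind the `-Binned` rows is not shown. The interval labels, including the first one starting at -0.001, are what `pandas.cut` produces with explicit edges and `include_lowest=True`, so a sketch under that assumption (the sample percentages are invented) could look like this:

```python
import pandas as pd

# Hypothetical village-level percentages (0-100) for one answer option.
village_pct = pd.Series([0, 0, 3, 7.5, 15, 55, 100, 100])

# Bin edges matching the column labels in the table above.
edges = [0, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
binned = pd.cut(village_pct, bins=edges, include_lowest=True)

# Number of villages per bin, in bin order (empty bins are kept as 0).
villages_per_bin = binned.value_counts().sort_index()
print(villages_per_bin)
```

Note that the first bin includes 0, so villages where no household chose the option and villages with up to 5% fall into the same column.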
In [86]:
create_download_link(df=wss_connection_status,
                     filename='List of Villages - % of Households by Frequency of Water Supply.csv',
                     title='Download Complete List',
                     level='Village')

Water shortage month-wise (% of households having a water supply which do not get water in each month)

In [87]:
print("Total Households which get water supply: ",raw_data[raw_data['Do you have water supply to your house/TBR ?']=='Yes'].shape[0])
Total Households which get water supply:  26886
Out[90]:
How often do you get water?                           365 days a year and 24/7  365 days a year but not 24/7  Not 12 months and not 24/7  Not 12 months but 24/7
Which months do you not have water supply?-January                         0.0                           0.0                        0.00                    0.00
Which months do you not have water supply?-February                        0.0                           0.0                       14.58                    3.68
Which months do you not have water supply?-March                           0.0                           0.0                       70.65                   57.18
Which months do you not have water supply?-April                           0.0                           0.0                       96.86                   94.65
Which months do you not have water supply?-May                             0.0                           0.0                       93.27                   94.74
Which months do you not have water supply?-June                            0.0                           0.0                       60.72                   56.27
Which months do you not have water supply?-July                            0.0                           0.0                        9.48                    7.99
Which months do you not have water supply?-August                          0.0                           0.0                        4.90                    1.47
Which months do you not have water supply?-September                       0.0                           0.0                        3.92                    1.93
Which months do you not have water supply?-October                         0.0                           0.0                        2.75                    1.37
Which months do you not have water supply?-November                        0.0                           0.0                        2.75                    0.04
Which months do you not have water supply?-December                        0.0                           0.0                        2.94                    0.02

Why do households not get 24x7 water supply?

Does the entire village get water supply?

Total Households With Water Supply:
Out[92]:
Total Households    31019.00
% of Total             74.62
Name: Yes, Having Water Connection, dtype: float64

Observation: Villages where more than 20% of the connected households do not get water:

129

Villages where over 50% of connected households do not get water:

92

Villages where 100% of households receive water:

312

Villages where no (0%) households receive water:

79
In [103]:
create_download_link(df=villages_with_no_supply,title="Click to Download",
                     filename='List of Villages-% of HH with Water Supply.csv',
                    level='Village')
Out[103]:

Why is the water not being supplied?

Distribution of households (%) by major reasons for no supply of water:

a. Across all regions -

b. In districts

c. In Villages (Top 50)

Cells in red denote the village where the issue is most common

% of Households with No Water Supply Reason(s) for no water supply-Caste/ Social Issue Reason(s) for no water supply-Committee decided Reason(s) for no water supply-Elevation issue Reason(s) for no water supply-Low/No pressure in the individual pipeline Reason(s) for no water supply-Non-payment of user fee Reason(s) for no water supply-Other Reason(s) for no water supply-Problem with the distribution pipeline Reason(s) for no water supply-Problem with the individual pipeline Reason(s) for no water supply-Water shortage
District Block Gram Panchayat Village
GAJAPATI GOSANI SABARA PADMAPUR 100 0 0 0 0 0 0 0 100 100
KALAHANDI THUAMUL RAMPUR SINDHIPADAR SIRIMASKA 100 0 0 33.33 66.67 0 0 33.33 0 0
THUAMUL RAMPUR SIMILIPADAR 100 0 0 0 20 0 0 100 0 10
PALIJHAR 100 0 0 0 0 100 0 0 0 0
MOTACHUAN 100 0 0 0 6.67 6.67 0 93.33 0 10
KOSABARA 100 0 0 21.88 6.25 50 0 96.88 0 6.25
KENDUPADA 100 0 0 0 0 8.7 0 91.3 0 8.7
GHATIGUDA 100 0 0 0 0 11.11 0 55.56 0 0
SINDHIPADAR ZILLA GAON 100 0 0 46.15 38.46 0 0 84.62 53.85 23.08
NICHEMASKA 100 0 0 100 50 0 0 50 0 0
ODRI DALGUDA 100 0 0 5.77 0 0 0 96.15 0 0
SINDHIPADAR KANDAJHAPI 100 0 0 0 0 0 0 37.5 0 12.5
BENDAJHOLA 100 0 0 0 0 100 0 0 0 0
ARAKHPURI 100 0 0 0 0 0 0 0 100 0
ODRI TENTULIPADA 100 0 0 0 0 0 0 100 0 0
PINDAUL 100 0 0 0 0 0 0 100 0 0
PENGDHUSI 100 0 0 0 0 0 0 0 0 100
KOKELPADAR 100 0 0 0 0 0 0 100 0 0
KANDHAMAL BALIGUDA RUTUNGIA KADIGANDA 100 0 0 0 0 0 0 0 0 0
SINDHRIGAON NILIPADA 100 0 0 100 0 0 0 0 0 0
DARINGIBADI DARINGIBADI SIRIPANKA 100 0 0 0 0 0 0 100 0 0
K. NUAGAON SIRTIGUDA GASUKIA 100 0 0 100 0 0 0 0 0 0
KEONJHAR JHUMPURA TUKUDIHA BARAHAPOSI 100 0 0 20 20 0 0 0 80 0
NISCHINTAPUR NISCHINTAPUR 100 0 0 42.86 57.14 0 0 14.29 28.57 0
BARIA BHUBANAPOSI 100 0 0 0 30.77 0 0 38.46 0 0
BADANEULI RATNAPOSI 100 0 0 0 33.33 0 0 33.33 33.33 33.33
PODASIMILA 100 0 0 0 0 0 0 0 66.67 66.67
HARICHANDANPUR MANAHARAPUR SATYAPAL 100 0 0 0 0 50 0 0 0 0
JEERANGA TANGARPADA 100 0 100 0 0 0 0 0 0 0
SANTOSHPUR 100 0 0 0 0 0 0 100 0 0
CHAMPUA SUNAPOSHI JADAPOKHARI 100 0 0 0 100 0 0 0 0 0
SADANGI RAJABANDHA 100 0 100 0 0 0 0 39.47 0 0
RANGAMATIA BANKIA 100 0 0 0 0 0 0 0 100 0
PADUA BALIPOSHI 100 0 0 25 0 0 66.67 8.33 0 16.67
BHANDA GULUDIPOSHI 100 0 0 0 33.33 16.67 0 25 75 0
KANDHAMAL K. NUAGAON SIRTIGUDA KUDUPAKIA 100 0 0 0 100 0 0 0 50 0
GUNJIGAON 100 0 0 0 0 25 0 0 0 50
KALAHANDI THUAMUL RAMPUR ODRI DHOLPASS 100 0 0 33.33 9.52 0 0 80.95 0 4.76
NAKRUNDI TARAPADAR 100 0 0 0 0 0 14.29 85.71 0 14.29
KEONJHAR KEONJHAR GOBARDHAN BANUAMAHANTA SAHI 100 0 0 0 0 0 100 0 0 0
KALAHANDI THUAMUL RAMPUR GUNUPUR RANAPUR 100 0 0 0 0 0 0 0 0 0
KANIGUMA DHULIGUDA 100 0 0 0 0 0 0 0 100 0
CHARCHIKINA 100 0 0 0 0 0 0 100 41.18 41.18
BALABHADRA COLONY 100 0 0 0 0 0 0 94.44 0 97.22
GUNUPUR TIKIRAPADA 100 0 0 0 0 0 0 0 0 0
SIMELPADAR-1 100 0 0 0 0 0 0 100 16.22 18.92
SIALIPADAR 100 0 0 6.67 3.33 0 0 93.33 0 3.33
SAISUMI 100 0 0 0 0 0 0 100 0 50
NUNRESH 100 0 0 40 10 0 0 0 50 40
NAKRUNDI TALAMPADAR 100 0 0 90.91 0 0 0 100 0 100

Frequency distribution of villages by reason for households (%) not being connected to the Water Supply System


Why have the given % of households not been connected to the water supply system in the village?

Out[109]:
                                                                                 (-0.001, 5.0]    (5.0, 10.0]   (10.0, 20.0]   (20.0, 30.0]   (30.0, 40.0]   (40.0, 50.0]   (50.0, 60.0]   (60.0, 70.0]   (70.0, 80.0]   (80.0, 90.0]  (90.0, 100.0]
Reason(s) for no water supply-Problem with the distribution pipeline-Binned                180              4             14             14             14              7              3              8              5             10             52
Reason(s) for no water supply-Other-Binned                                                 254              3              7              0              3              3              5              6              5              5             20
Reason(s) for no water supply-Problem with the individual pipeline-Binned                  212              2             14             13              9             16              4              5              4              3             29
Reason(s) for no water supply-Non-payment of user fee-Binned                               222              6              9              7              5             17              3              4              5              3             30
Reason(s) for no water supply-Water shortage-Binned                                        206             14             13              9             13             15              5              7              6              3             20
Reason(s) for no water supply-Elevation issue-Binned                                       251              9             13              7              8              6              2              3              1              1             10
Reason(s) for no water supply-Low/No pressure in the individual pipeline-Binned            259              6              9              6              9              3              1              2              5              3              8
Reason(s) for no water supply-Committee decided-Binned                                     295              1              0              1              3              4              1              2              0              0              4
Reason(s) for no water supply-Caste/ Social Issue-Binned                                   305              1              1              1              0              1              0              0              0              0              2

4. Availability of 3rd Tap Connection in Households

Households having 3rd Tap Connection

Households which opted for TBR and/or 3rd Tap Connection

Out[111]:
Is there a third tap?    Yes     No
Do you have TBR?
No                       923   7941
Toilet only              345   1264
Yes, I have TBR        13300  17796

Households which opted for Water Supply Connection and/or 3rd Tap Connection

Out[113]:
Is there a third tap?                                                       Yes     No
Do you NOW have tap connection (individual pipeline) to your house/ TBR?
Yes                                                                       13727  17292
No                                                                          841   9709

What is the Condition of the 3rd Tap installed in Households?

Where have people installed the 3rd Tap preferably?

5. Sources of Drinking Water

Where do people get drinking water from?

% of Households reporting different sources of drinking water

Overall

By Connection to Water Supply

By Availability of 3rd Tap

By District

In [120]:
hh_drnkng_wtr_src = raw_data.pivot_table(index='District',
               values = value,
               aggfunc=np.mean).round(2)
display_bar(hh_drnkng_wtr_src,size=[16,8],title='')

create_download_link(df=hh_drnkng_wtr_src,title="Download Complete List",
                     filename='List of District-Major Sources of Water for Households.csv',
                    level='District')

By Village

5. Disposal of Waste Water

How is waste water from bathroom disposed?

Number of households actually having waste water disposal system

22817  ( 54.89 % )

Where is waste water from the bathroom disposed?

Is the waste water flowing out of the bathroom properly disposed of in a sewer/kitchen garden/pit?

Out[125]:
Where is waste water from the bathroom disposed?  Flows into a common system  Kitchen Garden/ Plantation  Not to any particular point  Other (please specify)  Soak pit
How is waste water from bathroom disposed?
No specific provision has been made                                     1116                         319                         5734                      10       183
Other (please specify)                                                     6                          17                           37                    1274         4
Through a concrete dug channel                                          1731                        1067                          344                       0       241
Through a piped channel                                                 7949                        2674                         4842                      16      2539

6. Usage of Toilet in the households

Does anyone in family use the toilet?

% of households where the toilet is used by any family member

By Availability of TBR

Why does the family not use the toilet

Reason for not using the toilet (by % of Households)

Overall

District Wise

Village Wise

Households cite lack of water as the primary reason for not using the toilet. What, then, is the status of water supply in those households that are connected to the water supply?

In [134]:
raw_data[raw_data['Why dont they use the Toilet? (1)-No water']==100]['How often do you get water?'].value_counts()
Out[134]:
365 days a year but not 24/7    506
Not 12 months but 24/7          121
Not 12 months and not 24/7       77
365 days a year and 24/7         54
Name: How often do you get water?, dtype: int64
In [135]:
hh_tlt_no_usg_by_ws = raw_data[raw_data[par_col]=="No"].pivot_table(index=['How often do you get water?'],
               values = value,
               aggfunc=np.mean).round(2)
hh_tlt_no_usg_by_ws
Out[135]:
Columns are the options of "Why dont they use the Toilet? (1)" (column prefix omitted)
How often do you get water?   Habit  Latrine overflow  No electricity  No toilets at work/ field  No water  Other (please specify)  Prefer Defecating in Open  Toilet too far  Unable to use - broken  Used only during emergencies
365 days a year and 24/7       0.66              8.80            0.13                       8.28      7.10                       0                      13.67            4.86                   26.15                         13.01
365 days a year but not 24/7   0.36              8.88            0.47                       4.67     26.27                       0                      19.63            4.52                   17.96                          8.00
Not 12 months and not 24/7     1.09              2.19            0.55                       4.92     42.08                       0                       7.10            3.28                    8.74                          2.19
Not 12 months but 24/7         1.19              7.58            1.19                       1.78     17.98                       0                      14.56            7.28                   46.06                          2.67

On which occasions do the households not use the toilet?

Overall

In [136]:
col="What are the occasions family members not use the Toilet?"
options=get_options(col)
decompose_multiselect_answers_normalized(col,options)
value=[col+"-"+x for x in options]
hh_tlt_no_use_occ = raw_data.pivot_table(index='State',
                                        values=value,
                                        aggfunc=np.sum)
display_bar(hh_tlt_no_use_occ,title='',size=[16,9],pct=True)
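
`get_options` and `decompose_multiselect_answers_normalized` are helpers defined outside the cells shown. A rough sketch of what the decomposition step might do, assuming answers are stored as one delimited string per household (the `;` separator is an assumption) and indicators are stored as 0/100, which would be consistent with the `== 100` filters used elsewhere in this notebook. `decompose_multiselect` is a hypothetical name:

```python
import pandas as pd

def decompose_multiselect(df, col, sep=";"):
    """Add one 0/100 indicator column per option of a multi-select column (hypothetical helper)."""
    dummies = df[col].fillna("").str.get_dummies(sep=sep) * 100
    dummies = dummies.add_prefix(col + "-")
    return df.join(dummies), list(dummies.columns)

# Toy data: two households in district A, one unanswered household in B.
toy = pd.DataFrame({
    "District": ["A", "A", "B"],
    "Occasions": ["Rain;Night", "Night", None],
})
toy, value = decompose_multiselect(toy, "Occasions")

# The mean of a 0/100 indicator is the % of households choosing that option.
pct_by_district = toy.pivot_table(index="District", values=value, aggfunc="mean")
print(pct_by_district)
```

Storing indicators as 0/100 means a plain mean in `pivot_table` already reads as a percentage, with no extra scaling step.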

In Districts

In [137]:
hh_tlt_no_use_occ_dw = raw_data.pivot_table(index='District',
                                        values=value,
                                        aggfunc=np.sum)
display_bar(hh_tlt_no_use_occ_dw,title='',size=[16,9],pct=False)

7. Usage of Bathroom in the households

Does anyone in family use the bathroom?

% of households where the bathroom is used by any family member

By Availability of TBR

Why does the family not use the bathroom

Reason for not using the bathroom (by % of Households)

Overall

District Wise

Village Wise

Households cite lack of water as the major reason for not using the bathroom. What, then, is the status of water supply in those households?

In [146]:
raw_data[raw_data['Why do they not use the bathroom?-No water']==100]['How often do you get water?'].value_counts()
Out[146]:
365 days a year but not 24/7    822
Not 12 months but 24/7          155
Not 12 months and not 24/7      114
365 days a year and 24/7         66
Name: How often do you get water?, dtype: int64
In [147]:
hh_bth_no_usg_by_ws = raw_data[raw_data[par_col]=="No"].pivot_table(index=['How often do you get water?'],
               values = value,
               aggfunc=np.mean).round(2)
hh_bth_no_usg_by_ws
Out[147]:
Columns are the options of "Why do they not use the bathroom?" (column prefix omitted)
How often do you get water?   Bathroom is cold  No water  Other (please specify)  Prefer bathing in the open
365 days a year and 24/7                  3.29     10.86                       0                       31.41
365 days a year but not 24/7              1.26     37.03                       0                       27.70
Not 12 months and not 24/7                4.57     52.05                       0                       10.05
Not 12 months but 24/7                    4.58     30.88                       0                       30.68

Usage of Toilet and Bathroom among Family Members of Different Age Groups

Why don't adult male members (18-60 Years) use the Toilet?
Why don't adult male members (18-60 Years) use the bathroom?
Why don't adult female members (18-60 Years) use the Toilet?
Why don't adult female members (18-60 Years) use the bathroom?
Why don't elder male members (60+ Years) use the Toilet?
Why don't elder male members (60+ Years) use the bathroom?
Why don't elder female members (60+ Years) use the Toilet?
Why don't elder female members (60+ Years) use the bathroom?
Why dont boys (8-17 Years) use the Toilet?
Why don't boys (8-17 Years) use the bathroom?
Why don't girls (8-17 Years) use the Toilet?
Why don't girls (8-17 Years) use the bathroom?

Status of Water Supply and TBR in Villages

In [157]:
hh_wss_tbr_staus = raw_data.pivot_table(index=['District','Block','Gram Panchayat', 'Village'],
                               values = ['Total Households',
                                         "Caste (Avoiding asking. Ask only if doubtful)-SC",
                                         "Caste (Avoiding asking. Ask only if doubtful)-ST",
                                         "Caste (Avoiding asking. Ask only if doubtful)-OBC",
                                         "Caste (Avoiding asking. Ask only if doubtful)-General",
                                         'Do you have TBR?-Yes, I have TBR',
                                         'Do you have TBR?-Toilet only',
                                         'Do you have TBR?-No',
                                         'Did you ever have TBR?-No',
                                         'Do you NOW have tap connection (individual pipeline) to your house/ TBR?-Yes',
                                         'Do you NOW have tap connection (individual pipeline) to your house/ TBR?-No',
                                         'Do you have water supply to your house/TBR ?-Yes',
                                         'Do you have water supply to your house/TBR ?-No'],
                               aggfunc=np.sum)
Out[158]:

Household Survey Overview

In [171]:
create_download_link(hh_wss_status,title='Click to Download', filename='Household Survey Overview.csv',level='Village')
Out[171]: